Spanish Language Processing at University of Maryland: Building Infrastructure for Multilingual Applications

Authors

  • Clara Cabezas
  • Bonnie Dorr
  • Philip Resnik
Abstract

We describe our construction of lexical resources, creation of tools, building of an aligned parallel corpus, and an approach to automatic treebank creation that we have been developing using Spanish data. The approach is based on projecting English syntactic dependency information across a parallel corpus.
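
To make the idea of projecting dependencies across a parallel corpus concrete, here is a minimal sketch. It is not the authors' implementation: the function name `project_dependencies`, the one-to-one alignment assumption, and the toy sentence pair are all illustrative assumptions. Each English head-dependent pair is carried over to the Spanish side whenever both tokens have an aligned counterpart.

```python
from typing import Dict, List, Tuple

# A dependency is (head_index, dependent_index) over token positions.
Dependency = Tuple[int, int]


def project_dependencies(
    english_deps: List[Dependency],
    alignment: Dict[int, int],
) -> List[Dependency]:
    """Project English dependencies onto Spanish token positions.

    Assumption: `alignment` maps each aligned English token index to exactly
    one Spanish token index; unaligned English tokens are simply absent.
    """
    projected: List[Dependency] = []
    for head, dep in english_deps:
        # Keep only dependencies whose head and dependent are both aligned.
        if head in alignment and dep in alignment:
            es_head, es_dep = alignment[head], alignment[dep]
            if es_head != es_dep:  # skip degenerate self-loops
                projected.append((es_head, es_dep))
    return projected


# Toy example (hypothetical data): "the white house" / "la casa blanca".
english = ["the", "white", "house"]
spanish = ["la", "casa", "blanca"]
english_deps = [(2, 0), (2, 1)]      # house -> the, house -> white
alignment = {0: 0, 1: 2, 2: 1}       # the~la, white~blanca, house~casa

spanish_deps = project_dependencies(english_deps, alignment)
for head, dep in spanish_deps:
    print(f"{spanish[head]} -> {spanish[dep]}")
```

Real parallel text also contains unaligned words and one-to-many or many-to-one alignments, which this sketch deliberately ignores.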

Similar resources

The Spanish DELPH-IN grammar

In this article we present a Spanish grammar implemented in the Linguistic Knowledge Builder system and grounded in the theoretical framework of Head-driven Phrase Structure Grammar. The grammar is being developed in an international multilingual context, the DELPH-IN Initiative, contributing to an open-source repository of software and linguistic resources for various Natural Language Processi...

Improving Multilingual Catalog Search Services by Means of Multilingual Thesaurus Disambiguation

Multilinguality is an important aspect for the creation of public services in countries like Spain, with four official languages (Spanish, Catalonian, Basque and Galician), and overall, if these services are aimed for a European audience with a big number of official languages. Thus, an initiative for creating a catalog service at the Spanish or at the European level must take into account the ...

How to Add a New Language on the NLP Map: Building Resources and Tools for Languages with Scarce Resources

Those of us whose mother tongue is not English or are curious about applications involving other languages, often find ourselves in the situation where the tools we require are not available. According to recent studies there are about 7200 different languages spoken worldwide – without including variations or dialects – out of which very few have automatic language processing tools and machine...

Rapid Building of an ASR System for Under-Resourced Languages Based on Multilingual Unsupervised Training

This paper presents our work on rapid language adaptation of acoustic models based on multilingual cross-language bootstrapping and unsupervised training. We used Automatic Speech Recognition (ASR) systems in the six source languages English, French, German, Spanish, Bulgarian and Polish to build from scratch an ASR system for Vietnamese, an underresourced language. System building was performe...

FreeLing 2.1: Five Years of Open-source Language Processing Tools

FreeLing is an open-source multilingual language processing library providing a wide range of language analyzers for several languages. It offers text processing and language annotation facilities to natural language processing application developers, simplifying the task of building those applications. FreeLing is customizable and extensible. Developers can use the default linguistic resources...

Publication date: 2001